
OpenCL Actors - Adding Data Parallelism to Actor-based Programming with CAF



Abstract

The actor model of computation has been designed for seamless support of concurrency and distribution. However, it remains unspecific about data-parallel program flows, while the available processing power of modern many-core hardware such as graphics processing units (GPUs) or coprocessors increases the relevance of data parallelism for general-purpose computation. In this work, we introduce OpenCL-enabled actors to the C++ Actor Framework (CAF). This offers a high-level interface for accessing any OpenCL device without leaving the actor paradigm. The new type of actor is integrated into the runtime environment of CAF and gives rise to transparent message passing in distributed systems on heterogeneous hardware. Following the actor logic in CAF, OpenCL kernels can be composed while encapsulated in C++ actors, and hence operate in a multi-stage fashion on data resident at the GPU. Developers are thus enabled to build complex data-parallel programs from primitives without leaving the actor paradigm, nor sacrificing performance. Our evaluations on commodity GPUs, an Nvidia TESLA, and an Intel PHI reveal the expected linear scaling behavior when offloading larger workloads. For sub-second duties, the efficiency of offloading was found to differ largely between devices. Moreover, our findings indicate a negligible overhead over programming with the native OpenCL API.
